General Information




Table 1.1: Key mutations that define the strain

| Gene | Nucleotide Mutations | Amino Acid Changes |
|---|---|---|
| ORF1ab | C3266T, T6953C, C5387A, 11288:11296 deletion | T1001I, I2230T, A1708D |
| S | A23062T, C23270A, A23402G, C23603A, C23708T, T24505G, G24913C | DEL69-70, DEL144Y, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H |
| N | GAT28279CTA, C28976T | D3L, S235F |
| ORF8 | G28047T, C27971T, A28110G | R52I, Q27_, Y73C |

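
As a sanity check on the table above, the following sketch maps a nucleotide position to the codon it falls in. It assumes (this is not stated in the source) that the positions are 0-based offsets on the Wuhan-Hu-1 reference genome (NC_045512.2); under that assumption the listed positions line up with the listed amino acid numbers. The names `GENE_START` and `codon_number` are illustrative only.

```python
# Start of each coding region as a 0-based offset on the Wuhan-Hu-1
# reference (NC_045512.2). ASSUMPTION: the table uses 0-based positions.
GENE_START = {
    "ORF1ab": 265,
    "S": 21562,
    "ORF8": 27893,
    "N": 28273,
}


def codon_number(gene: str, position: int) -> int:
    """Return the 1-based codon number hit by a 0-based genome position.

    Only valid for positions inside the gene and, for ORF1ab, upstream of
    the ribosomal frameshift (true for every ORF1ab position in the table).
    """
    offset = position - GENE_START[gene]
    if offset < 0:
        raise ValueError(f"position {position} lies before the start of {gene}")
    return offset // 3 + 1


# Pair a few table entries with the amino acid change listed for them.
examples = [
    ("ORF1ab", 3266, "T1001I"),
    ("S", 23062, "N501Y"),
    ("S", 23402, "D614G"),
    ("N", 28976, "S235F"),
    ("ORF8", 28047, "R52I"),
]
for gene, position, change in examples:
    print(f"{gene} position {position} -> codon {codon_number(gene, position)} ({change})")
```

The integer division by three reflects that each codon spans three nucleotides; the `+ 1` converts to the 1-based numbering used for amino acids.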
Figure 1.1: genetic distance (root-to-tip), a measure of evolutionary change, plotted for strain and related non-strain samples (excluding other well-known VOCs, e.g. B.1.351).

State Prevalence


Figure 2.1: the spatial (geographical) prevalence of the B.1.1.7 strain across California.
Figure 2.2: temporal (over time) prevalence of the strain across California.

National Prevalence


Figure 3.1: the spatial (geographical) prevalence of the strain across the US.
Figure 3.2: the temporal (over time) prevalence of the strain across the US.

Global Prevalence


Figure 4.1: the spatial (geographical) prevalence of the strain across the world.
Figure 4.2: the temporal (over time) prevalence of the strain across the world.

Notes on Sampling


As Figure 3.2 indicates, the majority of B.1.1.7 genomes identified in the US so far were flagged by S-gene target failure (SGTF) in community-based diagnostic PCR testing. Because SGTF screening is a targeted rather than an unbiased sampling approach, these counts do not indicate the true prevalence of the B.1.1.7 lineage in the US; they only confirm that the lineage is present.
P.S.: Estimates of true prevalence in the US are discussed in this post.
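
To make the sampling bias concrete, here is a small worked sketch. All numbers are hypothetical, and it assumes that every B.1.1.7 sample shows SGTF and that only SGTF samples are sent for sequencing; neither assumption comes from the data above.

```python
def apparent_prevalence(n_positive: int, true_prevalence: float,
                        sgtf_rate_other: float) -> float:
    """Fraction of sequenced genomes that are the variant when ONLY
    SGTF-flagged samples are sequenced.

    ASSUMPTIONS (illustrative): every variant sample shows SGTF, and a
    fraction `sgtf_rate_other` of non-variant samples shows SGTF too.
    """
    variant = n_positive * true_prevalence
    other_sgtf = (n_positive - variant) * sgtf_rate_other
    return variant / (variant + other_sgtf)


# Hypothetical scenario: the variant makes up 1% of positive tests, and
# 1% of all other positive tests also show SGTF.
true_share = 0.01
share_among_sequenced = apparent_prevalence(100_000, true_share, 0.01)
print(f"true prevalence:        {true_share:.1%}")
print(f"share among sequenced:  {share_among_sequenced:.1%}")
```

In this scenario the variant's share among sequenced genomes is far higher than its share among all positive tests, which is why SGTF-selected genomes cannot be read as a prevalence estimate.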

Figure 5: a simple illustration of how genomic surveillance of COVID-19 samples could give us an increasingly clear picture of how the virus is evolving and spreading. The pictures above are electron microscopy images of SARS-CoV-2 (credit: NIAID) that are "crappified" with salt-and-pepper noise to varying degrees, depending on the rate of COVID-19 sequencing at each location. As a reference, we include a clear picture on the right to indicate that a 5% genomic sampling rate would be an ideal first objective for observing statistically significant phenomena.
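
The degradation described in the caption above could be sketched roughly as follows. This is not the code used to produce the figure: the linear mapping from sequencing rate to noise level (`noise_fraction`) is a hypothetical choice, and the image is represented as plain nested lists of grayscale values to keep the example dependency-free.

```python
import random

TARGET_RATE = 0.05  # the 5% sampling objective named in the caption


def noise_fraction(sampling_rate: float) -> float:
    """HYPOTHETICAL mapping: full noise at a 0% sequencing rate, falling
    linearly to no noise at or above the target rate."""
    return max(0.0, 1.0 - sampling_rate / TARGET_RATE)


def salt_and_pepper(image, fraction, seed=0):
    """Return a copy of a grayscale image (rows of 0-255 ints) in which
    each pixel is replaced, with probability `fraction`, by pure black (0)
    or pure white (255)."""
    rng = random.Random(seed)
    noisy = [row[:] for row in image]
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < fraction:
                row[x] = rng.choice((0, 255))
    return noisy


# A flat mid-gray test image stands in for the microscopy picture.
image = [[128] * 8 for _ in range(8)]
for rate in (0.0, 0.01, 0.05):
    noisy = salt_and_pepper(image, noise_fraction(rate))
    changed = sum(p != 128 for row in noisy for p in row)
    print(f"sequencing rate {rate:.0%}: {changed} of 64 pixels corrupted")
```

A location that sequences at or above the target rate keeps a clean image, while one that sequences nothing ends up with every pixel replaced by noise.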

Comments


Research laboratories across the US are encouraged to contribute to COVID-19 genomic sequencing efforts. More detailed information can be found here.

Specifically, when uploading genomes to GISAID or GenBank, please indicate whether the sample was identified via S-gene target failure (SGTF). This can be recorded in the "purpose_of_sequencing" field (GISAID) or the "Additional host information" field. Doing so will help in estimating the true prevalence of the lineage across the country.